Communication between cell types within a tissue is crucial for normal tissue function. Any disturbance in this communication may affect tissue homeostasis and contribute to the development of disease. We propose three major components through which normal cell type to cell type communication can be disturbed. The first is the abundance of the communicating cell types: in many diseases, such as acute myeloid leukemia (AML), the cell composition of the tissue changes drastically, which alters the communication paths. The second is the fraction of cells within a given cell type that actively use a certain ligand or receptor for communication: even if the abundance of the cell type itself is unchanged, expansion or shrinkage of the active fraction will affect the communication strength in the tissue. The third and most intuitive component is the expression level of a ligand or receptor in the active fraction of cells, which can be either upregulated or downregulated in disease. Here we developed an algorithm that takes all three components into account to reconstruct the communication paths within a tissue and to perform differential communication analysis between case and control cohorts.
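
The three components can be made concrete with a small sketch. The abstract does not state the exact formula, so everything below is an assumption made for illustration: a cell counts as "active" for a gene when its normalized expression exceeds a threshold (here 0), and the three components are combined by simple multiplication for the ligand side and the receptor side. The function names `component_scores` and `interaction_weight` are hypothetical and are not taken from the authors' package.

```python
def component_scores(expr, cell_types, gene, cell_type, threshold=0.0):
    """Return (abundance, active_fraction, mean_active_expression).

    expr       : list of per-cell rows, each row a list of normalized expression values
    cell_types : list with one cell type label per cell (same order as expr)
    gene       : column index of the ligand or receptor of interest
    threshold  : ASSUMPTION: a cell is "active" if its expression exceeds this value
    """
    values = [row[gene] for row, label in zip(expr, cell_types) if label == cell_type]
    if not values:
        return 0.0, 0.0, 0.0
    # Component 1: share of all cells in the sample that belong to this cell type.
    abundance = len(values) / len(cell_types)
    # Component 2: share of cells of this type that express the gene.
    active = [v for v in values if v > threshold]
    active_fraction = len(active) / len(values)
    # Component 3: mean expression among the active cells only.
    mean_active = sum(active) / len(active) if active else 0.0
    return abundance, active_fraction, mean_active


def interaction_weight(expr, cell_types, ligand, receptor, sender, receiver, threshold=0.0):
    """ASSUMPTION: weight = product of the three components on each side."""
    a_s, f_s, e_s = component_scores(expr, cell_types, ligand, sender, threshold)
    a_r, f_r, e_r = component_scores(expr, cell_types, receptor, receiver, threshold)
    return (a_s * f_s * e_s) * (a_r * f_r * e_r)


# Toy sample: 4 cells, column 0 is a ligand, column 1 is its receptor.
toy_expr = [[2.0, 0.0], [0.0, 0.0], [0.0, 4.0], [0.0, 2.0]]
toy_types = ["T", "T", "B", "B"]
print(interaction_weight(toy_expr, toy_types, 0, 1, "T", "B"))  # 0.75
```

In the toy sample, halving the number of T cells, halving the fraction of T cells that express the ligand, or halving the ligand level in those cells would each halve the weight, which is the behaviour the three-component view is meant to capture.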

The input to the algorithm is a normalized single-cell RNA-seq count matrix for each sample, together with the corresponding cell type annotation. The database of potential ligand-receptor interactions is provided with the algorithm and is an expanded adaptation of the iTALK database (Wang, 2019). The first step of the algorithm is to calculate the list of cell type to cell type interactions per sample. For this, the algorithm goes through all pairwise combinations of the cell types in a sample and, for each combination, calculates the interaction strength of every ligand-receptor pair in the database using a formula that covers all three components described above. Once this procedure has been completed for all samples, the algorithm creates an output matrix of cell type interactions for the full dataset. In the next step, the algorithm performs a differential communication analysis between the case and the control samples using the Wilcoxon test with Bonferroni adjustment.
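
The differential step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: because case and control samples are unpaired, the "Wilcoxon test" is read here as the two-sided Wilcoxon rank-sum (Mann-Whitney) test, the p-value uses a normal approximation with tie correction rather than an exact distribution, and the names `rank_sum_pvalue`, `bonferroni` and `differential_communication` are hypothetical. Real analyses would normally rely on an established statistics library.

```python
import math
from statistics import NormalDist


def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    if n1 == 0 or n2 == 0:
        return 1.0
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j for tied values
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        t = j - i
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return 1.0  # all values identical: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    return min(1.0, 2.0 * (1.0 - NormalDist().cdf(abs(z))))


def bonferroni(pvalues):
    """Multiply each p-value by the number of tests and cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def differential_communication(weights):
    """weights maps an interaction label to (case_weights, control_weights),
    each a list with one interaction weight per sample.
    Returns a dict of Bonferroni-adjusted p-values per interaction."""
    labels = list(weights)
    raw = [rank_sum_pvalue(*weights[label]) for label in labels]
    return dict(zip(labels, bonferroni(raw)))
```

A call such as `differential_communication({"T:L1 -> B:R1": ([0.9, 0.8, 0.7], [0.1, 0.2, 0.3])})` returns one adjusted p-value per interaction, which can then be compared against the chosen significance cutoff; the interaction label format shown is only a placeholder.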

To test the algorithm, we used two published single-cell RNA-seq datasets of human bone marrow containing samples from AML patients at diagnosis as well as from healthy donors. We selected 8 AML samples from van Galen, 2019; as a control group, we used 26 samples from healthy donors (4 from van Galen, 2019 and 22 from Oetjen, 2018). We unified the cell type annotation into 6 major cell types: hematopoietic stem and progenitor cells (HSPCs), monocytes, erythrocytes, B cells, T cells, and dendritic cells (DCs). We integrated the data using the scGen package (Lotfollahi, 2019). We identified over 2500 good-quality communication paths, 1449 of which were differential between the healthy and the AML samples (adjusted p-value < 0.1). The most intensively communicating cell types in the healthy samples were T cells and B cells, whereas HSPC/blast self-communication and HSPC/blast-DC communication were the most intensive in AML. As a proof of concept, our algorithm identified a CXCR3-positive subpopulation of DCs that was lost in AML. Further analysis revealed this subpopulation to be plasmacytoid DCs (pDCs), the loss of which has been previously described in AML.

In conclusion, we developed a tool that, for the first time, can perform differential communication analysis in case-control studies using single-cell RNA-seq. Our algorithm reconstructs communication within tissues by considering three major communication components: the abundance of the cell types, the size of the actively communicating fraction of cells, and the intensity of ligand and receptor expression. We tested our algorithm on two publicly available single-cell RNA-seq datasets from human bone marrow, which revealed major shifts in cell type to cell type communication in AML. This analysis is crucial for generating new data-driven hypotheses for a deeper understanding of AML development. Our algorithm will be available soon as an R package.

No relevant conflicts of interest to declare.
